Prince Edward Island
How AI can help protect bees from dangerous parasites
Tiny but mighty, honeybees play a crucial role in our ecosystems, pollinating various plants and crops. They also support the economy. These small producers contribute billions of dollars to Canada's agriculture industry, making Canada a major honey producer. However, in the winter of 2024, Canada's honey industry faced a severe collapse. Canada lost more than one-third of its beehives, primarily due to the widespread infestation of Varroa mites.
Using multi-agent architecture to mitigate the risk of LLM hallucinations
Amer, Abd Elrahman, Amer, Magdi
Recent advancements in Large Language Models (LLMs) have significantly enhanced the ability to develop systems that comprehend customer requests and determine the necessary actions to fulfill them. In today's competitive market, delivering superior custome r service is crucial for attracting and retaining clients. Satisfied customers are more likely to become loyal, repeat buyers, and advocate for your brand, leading to increased revenue and market share (Strikingly, 2024) . In industries characterized by intense competition, implementing LLM - based services that effectively address customer needs and enhance satisfaction is becoming a key determinant of a company's growth and success. By leveraging LLMs, businesses can deliver more personalized, efficient, and scalable support, and thereby improve customer experience and foster loyalty (Iopex, 2024) .
Climate land use and other drivers impacts on island ecosystem services: a global review
Moustakas, Aristides, Zemah-Shamir, Shiri, Tase, Mirela, Zotos, Savvas, Demirel, Nazli, Zoumides, Christos, Christoforidi, Irene, Dindaroglu, Turgay, Albayrak, Tamer, Ayhan, Cigdem Kaptan, Fois, Mauro, Manolaki, Paraskevi, Sandor, Attila D., Sieber, Ina, Stamatiadou, Valentini, Tzirkalli, Elli, Vogiatzakis, Ioannis N., Zemah-Shamir, Ziv, Zittis, George
Islands are diversity hotspots and vulnerable to environmental degradation, climate variations, land use changes and societal crises. These factors can exhibit interactive impacts on ecosystem services. The study reviewed a large number of papers on the climate change-islands-ecosystem services topic worldwide. Potential inclusion of land use changes and other drivers of impacts on ecosystem services were sequentially also recorded. The study sought to investigate the impacts of climate change, land use change, and other non-climatic driver changes on island ecosystem services. Explanatory variables examined were divided into two categories: environmental variables and methodological ones. Environmental variables include sea zone geographic location, ecosystem, ecosystem services, climate, land use, other driver variables, Methodological variables include consideration of policy interventions, uncertainty assessment, cumulative effects of climate change, synergistic effects of climate change with land use change and other anthropogenic and environmental drivers, and the diversity of variables used in the analysis. Machine learning and statistical methods were used to analyze their effects on island ecosystem services. Negative climate change impacts on ecosystem services are better quantified by land use change or other non-climatic driver variables than by climate variables. The synergy of land use together with climate changes is modulating the impact outcome and critical for a better impact assessment. Analyzed together, there is little evidence of more pronounced for a specific sea zone, ecosystem, or ecosystem service. Climate change impacts may be underestimated due to the use of a single climate variable deployed in most studies. Policy interventions exhibit low classification accuracy in quantifying impacts indicating insufficient efficacy or integration in the studies.
Comparative Analysis of Quantum and Classical Support Vector Classifiers for Software Bug Prediction: An Exploratory Study
Nadim, Md, Hassan, Mohammad, Mandal, Ashis Kumar, Roy, Chanchal K., Roy, Banani, Schneider, Kevin A.
Purpose: Quantum computing promises to transform problem-solving across various domains with rapid and practical solutions. Within Software Evolution and Maintenance, Quantum Machine Learning (QML) remains mostly an underexplored domain, particularly in addressing challenges such as detecting buggy software commits from code repositories. Methods: In this study, we investigate the practical application of Quantum Support Vector Classifiers (QSVC) for detecting buggy software commits across 14 open-source software projects with diverse dataset sizes encompassing 30,924 data instances. We compare the QML algorithm PQSVC (Pegasos QSVC) and QSVC against the classical Support Vector Classifier (SVC). Our technique addresses large datasets in QSVC algorithms by dividing them into smaller subsets. We propose and evaluate an aggregation method to combine predictions from these models to detect the entire test dataset. We also introduce an incremental testing methodology to overcome the difficulties of quantum feature mapping during the testing approach. Results: The study shows the effectiveness of QSVC and PQSVC in detecting buggy software commits. The aggregation technique successfully combines predictions from smaller data subsets, enhancing the overall detection accuracy for the entire test dataset. The incremental testing methodology effectively manages the challenges associated with quantum feature mapping during the testing process. Conclusion: We contribute to the advancement of QML algorithms in defect prediction, unveiling the potential for further research in this domain. The specific scenario of the Short-Term Activity Frame (STAF) highlights the early detection of buggy software commits during the initial developmental phases of software systems, particularly when dataset sizes remain insufficient to train machine learning models.
Quantum vs. Classical Machine Learning Algorithms for Software Defect Prediction: Challenges and Opportunities
Nadim, Md, Hassan, Mohammad, Mandal, Ashis Kumar, Roy, Chanchal K.
Software defect prediction is a critical aspect of software quality assurance, as it enables early identification and mitigation of defects, thereby reducing the cost and impact of software failures. Over the past few years, quantum computing has risen as an exciting technology capable of transforming multiple domains; Quantum Machine Learning (QML) is one of them. QML algorithms harness the power of quantum computing to solve complex problems with better efficiency and effectiveness than their classical counterparts. However, research into its application in software engineering to predict software defects still needs to be explored. In this study, we worked to fill the research gap by comparing the performance of three QML and five classical machine learning (CML) algorithms on the 20 software defect datasets. Our investigation reports the comparative scenarios of QML vs. CML algorithms and identifies the better-performing and consistent algorithms to predict software defects. We also highlight the challenges and future directions of employing QML algorithms in real software defect datasets based on the experience we faced while performing this investigation. The findings of this study can help practitioners and researchers further progress in this research domain by making software systems reliable and bug-free.
Tokenization as Finite-State Transduction
Cognetta, Marco, Okazaki, Naoaki
Tokenization is the first step in modern neural language model pipelines where an input text is converted to a sequence of subword tokens. We introduce from first principles a finite-state transduction framework which can efficiently encode all possible tokenizations of a regular language. We then constructively show that Byte-Pair Encoding (BPE) and MaxMatch (WordPiece), two popular tokenization schemes, fit within this framework. For BPE, this is particularly surprising given its resemblance to context-free grammar and the fact that it does not tokenize strings from left to right. An application of this is to guided generation, where the outputs of a language model are constrained to match some pattern. Here, patterns are encoded at the character level, which creates a mismatch between the constraints and the model's subword vocabulary. While past work has focused only on constraining outputs without regard to the underlying tokenization algorithm, our framework allows for simultaneously constraining the model outputs to match a specified pattern while also adhering to the underlying tokenizer's canonical tokenization.
Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning
Gao, Fengyu, Zhou, Ruida, Wang, Tianhao, Shen, Cong, Yang, Jing
Large Language Models (LLMs) rely on the contextual information embedded in examples/demonstrations to perform in-context learning (ICL). To mitigate the risk of LLMs potentially leaking private information contained in examples in the prompt, we introduce a novel data-adaptive differentially private algorithm called AdaDPSyn to generate synthetic examples from the private dataset and then use these synthetic examples to perform ICL. The objective of AdaDPSyn is to adaptively adjust the noise level in the data synthesis mechanism according to the inherent statistical properties of the data, thereby preserving high ICL accuracy while maintaining formal differential privacy guarantees. A key innovation in AdaDPSyn is the Precision-Focused Iterative Radius Reduction technique, which dynamically refines the aggregation radius - the scope of data grouping for noise addition - based on patterns observed in data clustering, thereby minimizing the amount of additive noise. We conduct extensive experiments on standard benchmarks and compare AdaDPSyn with DP few-shot generation algorithm (Tang et al., 2023). The experiments demonstrate that AdaDPSyn not only outperforms DP few-shot generation, but also maintains high accuracy levels close to those of non-private baselines, providing an effective solution for ICL with privacy protection.
Strategic Classification With Externalities
Chen, Yiling, Hossain, Safwan, Micha, Evi, Procaccia, Ariel
We propose a new variant of the strategic classification problem: a principal reveals a classifier, and $n$ agents report their (possibly manipulated) features to be classified. Motivated by real-world applications, our model crucially allows the manipulation of one agent to affect another; that is, it explicitly captures inter-agent externalities. The principal-agent interactions are formally modeled as a Stackelberg game, with the resulting agent manipulation dynamics captured as a simultaneous game. We show that under certain assumptions, the pure Nash Equilibrium of this agent manipulation game is unique and can be efficiently computed. Leveraging this result, PAC learning guarantees are established for the learner: informally, we show that it is possible to learn classifiers that minimize loss on the distribution, even when a random number of agents are manipulating their way to a pure Nash Equilibrium. We also comment on the optimization of such classifiers through gradient-based approaches. This work sets the theoretical foundations for a more realistic analysis of classifiers that are robust against multiple strategic actors interacting in a common environment.